Named Entity Recognition with a Maximum Entropy Approach
Abstract
The named entity recognition (NER) task involves identifying noun phrases that are names and assigning a class to each name. The task originated in the Message Understanding Conferences (MUC) of the 1990s, a series of conferences aimed at evaluating systems that extract information from natural language texts. It became evident that, to perform well at information extraction, a system must be able to recognize names. A separate NER subtask was created in MUC-6 and MUC-7 (Chinchor, 1998). Much research has since been carried out on NER, using both knowledge engineering and machine learning approaches. At the last CoNLL in 2002, a common NER task was used to evaluate competing NER systems. In this year's CoNLL, the NER task is to tag noun phrases with the following four classes: person (PER), organization (ORG), location (LOC), and miscellaneous (MISC). This paper presents a maximum entropy approach to the NER task in which the recognizer uses not only the local context within a sentence but also other occurrences of each word within the same document to extract useful features (global features). Such global features enhance the performance of NER (Chieu and Ng, 2002b).
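As a rough illustration of the distinction between local and global features drawn in the abstract, the following minimal Python sketch computes two local features from a token's own position and one global feature from other occurrences of the same word in the document. The function name `global_features`, the feature names, and the specific "capitalized elsewhere, away from a sentence start" test are assumptions made for illustration; they are not taken from the paper's feature set.

```python
from collections import defaultdict


def global_features(sentences):
    """Build per-token feature dicts for one document.

    `sentences` is a list of token lists. All feature names here are
    hypothetical and chosen only to illustrate local vs. global evidence.
    """
    # Count, per lowercased word, how often it appears capitalized
    # in a non-sentence-initial position anywhere in the document.
    cap_mid = defaultdict(int)
    for sent in sentences:
        for i, tok in enumerate(sent):
            if i > 0 and tok[:1].isupper():
                cap_mid[tok.lower()] += 1

    feats = []
    for sent in sentences:
        row = []
        for i, tok in enumerate(sent):
            # Exclude the current token's own contribution to the count.
            own = 1 if (i > 0 and tok[:1].isupper()) else 0
            row.append({
                "init_caps": tok[:1].isupper(),   # local feature
                "first_word": i == 0,             # local feature
                # Global feature: some OTHER occurrence is capitalized
                # away from a sentence start.
                "other_occurrence_caps": cap_mid[tok.lower()] - own > 0,
            })
        feats.append(row)
    return feats


doc = [
    ["Barclays", "reported", "profits"],
    ["Analysts", "said", "Barclays", "grew"],
]
features = global_features(doc)
```

In this toy document, the sentence-initial "Barclays" is ambiguous from local context alone, since any first word is capitalized, but the global feature fires because the same word is capitalized mid-sentence elsewhere. Feature dicts of this kind could then be passed to a maximum entropy (multinomial logistic regression) classifier.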
Similar resources
A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features
Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well on the Persian language if they are trained with good features. To get good performanc...
Maximum Entropy Approach based Named Entity Recognition in Punjabi Language
Named Entity Recognition is the task of identifying and classifying named entities into predefined categories such as person, location, and organization. NER is used in many applications, such as text summarization, text classification, question answering, and machine translation systems. For English, a lot of work has already been done in the field of NER, where capitalization is a major key fo...
Named Entity Recognition: A Maximum Entropy Approach Using Global Information
This paper presents a maximum entropy-based named entity recognizer (NER). It differs from previous machine learning-based NERs in that it uses information from the whole document to classify each word, with just one classifier. Previous work that involves the gathering of information from the whole document often uses a secondary classifier, which corrects the mistakes of a primary sentencebas...
Ranking Algorithms for Named Entity Extraction: Boosting and the Voted Perceptron
This paper describes algorithms which rerank the top N hypotheses from a maximum-entropy tagger, the application being the recovery of named-entity boundaries in a corpus of web data. The first approach uses a boosting algorithm for ranking problems. The second approach uses the voted perceptron algorithm. Both algorithms give comparable, significant improvements over the maximum-entropy baseli...
ME-CSSR: an Extension of CSSR using Maximum Entropy Models
In this work, an extension of the CSSR algorithm using Maximum Entropy Models is introduced. Preliminary experiments on Named Entity Recognition with this new system are presented.
Maximum Entropy Models for Named Entity Recognition
In this paper, we describe a system that applies maximum entropy (ME) models to the task of named entity recognition (NER). Starting with an annotated corpus and a set of features which are easily obtainable for almost any language, we first build a baseline NE recognizer which is then used to extract the named entities and their context information from additional nonannotated data. In turn, t...